Using machine learning techniques for grapheme to phoneme transcription
Abstract
The renewed interest in grapheme-to-phoneme conversion (G2P), driven by the need to develop multilingual speech synthesizers and recognizers, calls for new approaches that are more efficient than the traditional rule-and-exception ones. A number of studies have investigated the use of machine learning techniques to extract phonetic knowledge automatically from a lexicon. In this paper, we present the results of our experiments in this research field. Starting from the state of the art, our contribution is the development of a language-independent learning scheme for G2P based on Classification and Regression Trees (CART). To validate our approach, we built G2P converters for the following languages: British English, American English, French and Brazilian Portuguese.
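As a rough illustration of tree-based G2P learned from a lexicon, the sketch below trains a small classification tree that predicts one phoneme per letter from a window of neighbouring letters. It is not the authors' scheme: the toy lexicon, the one-letter-to-one-phoneme alignment (with "_" for a silent letter), the window size and all function names are assumptions made for the example, and the tree is a simplified CART-style learner using binary "letter at position equals value" splits scored by Gini impurity.

```python
from collections import Counter

# Hypothetical toy lexicon, hand-aligned one phoneme symbol per letter.
# "_" marks a silent letter. This is illustrative data, not data from the paper.
LEXICON = [
    ("cat", ["k", "a", "t"]),
    ("cot", ["k", "o", "t"]),
    ("city", ["s", "i", "t", "i"]),
    ("cent", ["s", "e", "n", "t"]),
    ("act", ["a", "k", "t"]),
    ("ice", ["i", "s", "_"]),
]


def windows(word, size=1):
    """Return one context tuple per letter; '#' pads the word edges."""
    padded = "#" * size + word + "#" * size
    return [tuple(padded[i:i + 2 * size + 1]) for i in range(len(word))]


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows):
    """Grow a tree from (features, phoneme) rows.

    A leaf is a phoneme string; an inner node is (position, letter, yes, no).
    """
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority
    best = None
    for pos in range(len(rows[0][0])):
        for value in sorted({x[pos] for x, _ in rows}):
            yes = [r for r in rows if r[0][pos] == value]
            no = [r for r in rows if r[0][pos] != value]
            if not yes or not no:
                continue  # this test does not separate anything
            score = (len(yes) * gini([y for _, y in yes])
                     + len(no) * gini([y for _, y in no])) / len(rows)
            if best is None or score < best[0]:
                best = (score, pos, value, yes, no)
    if best is None:  # identical contexts with conflicting labels
        return majority
    _, pos, value, yes, no = best
    return (pos, value, build_tree(yes), build_tree(no))


def predict(tree, features):
    """Walk the tree until a leaf (a phoneme string) is reached."""
    while isinstance(tree, tuple):
        pos, value, yes, no = tree
        tree = yes if features[pos] == value else no
    return tree


def train(lexicon, size=1):
    """Turn an aligned lexicon into per-letter rows and grow the tree."""
    rows = [(f, p) for word, phones in lexicon
            for f, p in zip(windows(word, size), phones)]
    return build_tree(rows)


def transcribe(tree, word, size=1):
    """Predict one phoneme symbol per letter of the word."""
    return [predict(tree, f) for f in windows(word, size)]


if __name__ == "__main__":
    tree = train(LEXICON)
    for word, _ in LEXICON:
        print(word, transcribe(tree, word))
```

Nothing in the learner depends on a particular alphabet or phoneme set, which is the sense in which such a scheme can be language-independent. A real system would also need a letter-to-phoneme alignment step, pruning, and evaluation on held-out words, none of which are shown here.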
Similar resources
Phoneme-to-grapheme Conversion for Out-of-vocabulary Words in Large Vocabulary Speech Recognition
In this paper, we describe a method to enhance the readability of the textual output in a large vocabulary continuous speech recognition system when out-of-vocabulary words occur. The basic idea is to replace uncertain words in the transcriptions with a phoneme recognition result that is postprocessed using a phoneme-to-grapheme converter. This converter turns phoneme strings into grapheme stri...
Phoneme-to-grapheme conversion for out-of-vocabulary words in speech recognition
In this report, we show that Out-Of-Vocabulary items (OOVs), recognized using phoneme recognition, can be reasonably reliably transcribed orthographically using Machine Learning techniques. More specifically, (i) we show baseline performance of a machine learning approach to phoneme-to-grapheme conversion when different levels of artificial noise are added (simulating phoneme recognizer errors)...
Machine Learning Based English-to-Korean Transliteration Using Grapheme and Phoneme Information
Machine transliteration is an automatic method to generate characters or words in one alphabetical system for the corresponding characters in another alphabetical system. Machine transliteration can play an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. The previous works focus on ei...
Memory-Based Phoneme-to-Grapheme Conversion: A Method for Dealing with Out-of-Vocabulary Items in Speech Recognition
In this paper, we describe a method to enhance the readability of out-of-vocabulary items (OOVs) in the textual output in a large vocabulary continuous speech recognition system. The basic idea is to indicate uncertain words in the transcriptions and replace them with phoneme recognition results that are post-processed using a phoneme-to-grapheme (P2G) converter. We concentrate on the final ste...
Treetalk-d: a Machine Learning Approach to Dutch Word Pronunciation
We present experimental results concerning the application of the IGTree decision-tree learning algorithm to Dutch word pronunciation. We evaluate four different Dutch word pronunciation systems configured to test the utility of modularization of grapheme-to-phoneme transcription (G) and stress prediction (S). Both training and testing data are extracted from the CELEX II lexical database. Experi...
Publication date: 2001